Results 1 - 6 of 6
1.
Genes (Basel) ; 14(9)2023 09 14.
Article in English | MEDLINE | ID: mdl-37761941

ABSTRACT

Biomarker-based cancer identification and classification tools are widely used in bioinformatics and machine learning fields. However, the high dimensionality of microarray gene expression data poses a challenge for identifying important genes in cancer diagnosis. Many feature selection algorithms optimize cancer diagnosis by selecting optimal features. This article proposes an ensemble rank-based feature selection method (EFSM) and an ensemble weighted average voting classifier (VT) to overcome this challenge. The EFSM uses a ranking method that aggregates features from individual selection methods to efficiently discover the most relevant and useful features. The VT combines support vector machine, k-nearest neighbor, and decision tree algorithms to create an ensemble model. The proposed method was tested on three benchmark datasets and compared to existing built-in ensemble models. The results show that our model achieved higher accuracy, with 100% for leukaemia, 94.74% for colon cancer, and 94.34% for the 11-tumor dataset. This study concludes by identifying a subset of the most important cancer-causing genes and demonstrating their significance compared to the original data. The proposed approach surpasses existing strategies in accuracy and stability, significantly impacting the development of ML-based gene analysis. It detects vital genes with higher precision and stability than other existing methods.


Subjects
Neoplasms , Transcriptome , Transcriptome/genetics , Gene Expression Profiling , Algorithms , Benchmarking , Cluster Analysis , Neoplasms/diagnosis , Neoplasms/genetics
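
The abstract above describes aggregating feature rankings from several selection methods and combining classifiers by weighted voting. The sketch below illustrates both ideas in a simplified form. It is not the authors' EFSM or VT implementation: the mean-position aggregation rule, the label-level (rather than probability-level) vote, the function names, and the assumption that every ranking lists the same features are all illustrative choices.

```python
from collections import defaultdict


def aggregate_ranks(rankings, top_k):
    """Combine several feature rankings (best first) by summed rank position.

    Assumes every ranking contains the same set of features.
    """
    totals = defaultdict(float)
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] += position
    # A lower summed position means the feature ranked well across methods.
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(totals, key=lambda f: (totals[f], f))
    return ordered[:top_k]


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across classifiers."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Iterating labels in sorted order makes tie-breaking deterministic.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical rankings from three selection methods over four features.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g3"],
    ["g2", "g3", "g1", "g4"],
]
selected = aggregate_ranks(rankings, top_k=2)

# Hypothetical predictions from three classifiers with assumed weights.
label = weighted_vote(["tumor", "normal", "normal"], [0.5, 0.3, 0.1])
```

In practice the weights would typically come from each classifier's validation performance; here they are placeholders.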
2.
Biomed Res Int ; 2022: 1776082, 2022.
Article in English | MEDLINE | ID: mdl-35127939

ABSTRACT

BACKGROUND: Medulloblastoma (MB) is the most frequently occurring brain cancer and arises mostly in childhood. It starts in the cerebellum. This study was designed to screen for novel and significant biomarkers that may serve as potential prognostic biomarkers and therapeutic targets in MB. METHODS: A total of 103 MB-related samples from three gene expression profiles (GSE22139, GSE37418, and GSE86574) were downloaded from the Gene Expression Omnibus (GEO). All three datasets were analyzed with the limma package, and 1065 mutual differentially expressed genes (DEGs) were identified, 408 overexpressed and 657 underexpressed, using minimum cut-off criteria of |log fold change| > 1 and P < 0.05. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and WikiPathways enrichment analyses were performed to discover the functions of the mutual DEGs. The enrichment analysis showed that the common DEGs were significantly associated with MB progression and development. The Search Tool for Retrieval of Interacting Genes (STRING) database was used to construct the interaction network, which was displayed using Cytoscape; applying the connectivity and stress methods of the cytoHubba plugin, 35 hub genes were identified from the whole network. RESULTS: Four key clusters were identified using the PEWCC 1.0 method. Additionally, survival analysis of the hub genes was carried out based on clinical information from 612 MB patients. This bioinformatics analysis may help to define the pathogenesis of MB and to develop new treatments for it.


Subjects
Cerebellar Neoplasms , Medulloblastoma , Biomarkers , Cerebellar Neoplasms/genetics , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics , Humans , Medulloblastoma/genetics , Protein Interaction Maps/genetics
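
The cut-off criteria stated in the abstract above (|log fold change| > 1 and P < 0.05) and the notion of "mutual" DEGs across datasets can be sketched as follows. This is a plain-Python illustration rather than the limma workflow the study used. The input layout (gene mapped to a log fold change and P value), the function names, and the requirement that a gene change in the same direction in every dataset are assumptions made for the example.

```python
def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split genes into over- and underexpressed sets using the cut-offs.

    `results` maps a gene name to a (log_fold_change, p_value) pair.
    """
    up, down = set(), set()
    for gene, (log_fc, p_value) in results.items():
        if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).add(gene)
    return up, down


def mutual_degs(per_dataset):
    """Keep genes called in the same direction in every dataset (assumed rule)."""
    ups, downs = zip(*(select_degs(r) for r in per_dataset))
    return set.intersection(*ups), set.intersection(*downs)


# Hypothetical per-dataset results with placeholder gene names.
dataset_1 = {"A": (2.0, 0.01), "B": (-1.5, 0.03), "C": (0.5, 0.001), "D": (1.2, 0.2)}
dataset_2 = {"A": (1.1, 0.04), "B": (-2.0, 0.001), "C": (1.5, 0.01), "D": (1.3, 0.01)}

common_up, common_down = mutual_degs([dataset_1, dataset_2])
```

Gene C fails the fold-change cut-off in the first dataset and gene D fails the P-value cut-off there, so neither survives the intersection.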
3.
Biomed Res Int ; 2022: 5908402, 2022.
Article in English | MEDLINE | ID: mdl-35071597

ABSTRACT

Esophageal carcinoma (EsC) is a cancer that occurs in the esophagus; globally, it is known as one of the fatal malignancies. In this study, we used gene expression analysis to identify molecular biomarkers and to propose therapeutic targets for the development of novel drugs. We considered four EsC-associated microarray datasets from the Gene Expression Omnibus database. Statistical analysis performed in R identified a total of 1083 differentially expressed genes (DEGs), of which 380 are overexpressed and 703 are underexpressed. A functional study was performed on the identified DEGs to screen significant Gene Ontology (GO) terms and associated pathways using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) repository. The analysis revealed that the overexpressed DEGs are principally connected with the protein export and axon guidance pathways, and the underexpressed DEGs are principally connected with the L13a-mediated translational silencing of ceruloplasmin expression and formation of a pool of free 40S subunits pathways. The STRING database was used to collect protein-protein interaction (PPI) network information, which was visualized with the Cytoscape software. We found 10 hub genes in the PPI network using three methods, with the interleukin 6 (IL6) gene ranked top by all three. From the PPI network, we found that the identified clusters are associated with complex I biogenesis, ubiquitination and proteasome degradation, signaling by interleukins, and the Notch-HLH transcription pathway. The identified biomarkers and pathways may play an important role in the future development of drugs for EsC.


Subjects
Carcinoma , Esophageal Neoplasms , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Carcinoma/genetics , Computational Biology/methods , Databases, Genetic , Esophageal Neoplasms/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics , Gene Ontology , Gene Regulatory Networks/genetics , Humans , Protein Interaction Maps/genetics
4.
Comput Biol Med ; 139: 104985, 2021 12.
Article in English | MEDLINE | ID: mdl-34735942

ABSTRACT

Cervical cancer (CC) is the most common type of cancer in women and remains a significant cause of mortality, particularly in less developed countries, although it can be effectively treated if detected at an early stage. This study aimed to find efficient machine-learning-based classification models to detect early-stage CC using clinical data. We obtained a CC dataset from the Kaggle data repository that contained four class attributes: biopsy, cytology, Hinselmann, and Schiller. The dataset was split into four datasets based on these class attributes. Three feature transformation methods (log, sine function, and Z-score) were applied to these datasets. Several supervised machine learning algorithms were assessed for their classification performance. A Random Tree (RT) algorithm provided the best classification accuracy for the biopsy (98.33%) and cytology (98.65%) data, whereas Random Forest (RF) and Instance-Based K-nearest neighbor (IBk) provided the best performance for Hinselmann (99.16%) and Schiller (98.58%), respectively. Among the feature transformation methods, the logarithmic transformation gave the best performance for the biopsy dataset, whereas the sine function was superior for cytology. The logarithmic and sine functions both performed best for the Hinselmann dataset, while Z-score was best for the Schiller dataset. Various feature selection techniques (FST) were applied to the transformed datasets to identify and prioritize important risk factors. The outcomes of this study indicate that, with appropriate system design and tuning, machine learning classification methods are able to detect CC accurately and efficiently in its early stages using clinical data.


Subjects
Uterine Cervical Neoplasms , Algorithms , Cluster Analysis , Early Detection of Cancer , Female , Humans , Machine Learning , Supervised Machine Learning , Uterine Cervical Neoplasms/diagnosis
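
The three feature transformations named in the abstract above (log, sine function, and Z-score) can be sketched for a single numeric feature column as follows. The abstract does not give the exact formulas, so the details here are assumptions: `log1p` is used so that zero values are handled, the Z-score uses the population standard deviation, and a constant column maps to zeros.

```python
import math
from statistics import mean, pstdev


def log_transform(values):
    """Logarithmic transform; log1p is an assumed choice that tolerates zeros."""
    return [math.log1p(v) for v in values]


def sine_transform(values):
    """Apply the sine function to each value (radians)."""
    return [math.sin(v) for v in values]


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
column = [1, 2, 3]
standardized = z_score(column)
```

Each function returns a new list, so the transformed columns can be compared side by side when evaluating which transformation suits a given classifier.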
5.
J Genet Eng Biotechnol ; 19(1): 43, 2021 Mar 19.
Article in English | MEDLINE | ID: mdl-33742334

ABSTRACT

BACKGROUND: Worldwide, more than 80% of identified lung cancer cases are non-small cell lung cancer (NSCLC). We used the microarray gene expression dataset GSE10245 to identify key biomarkers and associated pathways in NSCLC. RESULTS: To collect differentially expressed genes (DEGs) from the dataset GSE10245, we applied the R statistical language. Functional analysis was completed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) online repository. The DifferentialNet database was used to construct a protein-protein interaction (PPI) network, which was visualized with the Cytoscape software. Using the Molecular Complex Detection (MCODE) method, we identified clusters in the constructed PPI network. Finally, survival analysis was performed to obtain the overall survival (OS) values of the key genes. A total of 1082 DEGs were identified after applying the statistical criteria. Functional analysis showed that the overexpressed DEGs were strongly involved in epidermis development and keratinocyte differentiation; the underexpressed DEGs were principally associated with the positive regulation of nitric oxide biosynthetic process and signal transduction. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the overexpressed DEGs were highly involved in the cell cycle; the underexpressed DEGs were involved in cell adhesion molecules. The PPI network was constructed with 474 nodes and 2233 connections. CONCLUSIONS: Using the connectivity method, 12 genes were considered hub genes. Survival analysis showed worse OS values for SFN, DSP, and PHGDH. The outcomes indicate that Stratifin may play a crucial role in the development of NSCLC.
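
The "connectivity method" for hub genes mentioned in the abstract above is commonly understood as ranking nodes by degree, that is, by their number of interaction partners in the PPI network. The sketch below shows that reading on an undirected edge list. It does not use Cytoscape, and the function name, the handling of duplicate edges and self-loops, and the placeholder gene names are assumptions for illustration.

```python
from collections import Counter


def hub_genes_by_degree(edges, top_n):
    """Rank genes by the number of distinct interaction partners.

    `edges` is an iterable of (gene_a, gene_b) pairs for an undirected network.
    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique_edges:
        degree.update(pair)
    # Sort by descending degree, then by name for a deterministic order.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical network with placeholder gene names.
edges = [
    ("G1", "G2"),
    ("G1", "G3"),
    ("G2", "G1"),  # duplicate of the first edge, reversed
    ("G1", "G4"),
    ("G2", "G3"),
    ("G5", "G5"),  # self-loop, ignored
]
hubs = hub_genes_by_degree(edges, top_n=2)
```

For a network of the size reported in the abstract (474 nodes, 2233 connections), the same function would be called with a larger `top_n`, such as 12.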

6.
Brief Bioinform ; 22(2): 1254-1266, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33024988

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes coronavirus disease (COVID-19), which poses a major threat to humanity. As the spread of the virus grows harder to control by the day, the epidemic is passing through its most severe phase. Idiopathic pulmonary fibrosis (IPF) is a risk factor for COVID-19, as patients with long-term lung injuries are more likely to suffer severe infection. Transcriptomic datasets of SARS-CoV-2 infection and of IPF patients in lung epithelial cells were selected to identify the synergistic effect of SARS-CoV-2 on IPF patients. Common genes were identified to find shared pathways and drug targets for IPF patients with COVID-19 infection. Using several bioinformatics tools, a protein-protein interaction (PPI) network was constructed. Hub genes and essential modules were detected based on the PPI network. Interactions of transcription factor (TF) genes and miRNAs with the common differentially expressed genes, and the activity of the TFs, were also identified. Functional analysis using Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways found shared associations that may explain the increased mortality of IPF patients with SARS-CoV-2 infection. Drug molecules for IPF were also suggested for SARS-CoV-2 infection.


Subjects
COVID-19/complications , Idiopathic Pulmonary Fibrosis/complications , SARS-CoV-2/genetics , COVID-19/genetics , COVID-19/virology , Datasets as Topic , Epithelial Cells/virology , Gene Ontology , Genes, Viral , Humans , Lung/cytology , Lung/virology , Transcriptome